
fix(web): use session flavor label in voice context formatters#1

Closed
heavygee wants to merge 1 commit into main from fix/voice-flavor-labels

@heavygee heavygee commented May 24, 2026

Owner

Summary

Replaces hardcoded 'Claude Code' strings in voice context injections with the active session's flavor label (Cursor, Codex, Gemini, OpenCode, etc.) sourced from getFlavorLabel() in @hapi/protocol. Falls back to 'coding agent' for unknown or missing flavors.

web/src/realtime/hooks/contextFormatters.ts — threads agentLabel (optional, default 'coding agent') through formatPlainText, formatPermissionRequest, formatMessage (text blocks + tool-call lines), formatNewMessages, formatHistory, formatSessionFull, formatReadyEvent.

web/src/realtime/hooks/voiceHooks.ts — adds getAgentLabel(session) helper using getFlavorLabel() from @hapi/protocol; passes label into all formatter calls.
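
A minimal sketch of the label threading described above, not the code in this PR. The `FLAVOR_LABELS` table is a local stand-in for getFlavorLabel() (its real signature in @hapi/protocol is not shown here), and the `SessionLike` shape and the formatter's parameter order are assumptions made for illustration.

```typescript
// Hypothetical stand-in for the label table behind getFlavorLabel().
const FLAVOR_LABELS = new Map<string, string>([
    ['claude', 'Claude Code'],
    ['cursor', 'Cursor'],
    ['codex', 'Codex'],
    ['gemini', 'Gemini'],
    ['opencode', 'OpenCode'],
])

const DEFAULT_AGENT_LABEL = 'coding agent'

// Assumed minimal session shape; the PR reads session.metadata.flavor.
interface SessionLike {
    metadata?: { flavor?: string | null } | null
}

// Resolve the label once per call; unknown or missing flavors fall back.
function getAgentLabel(session: SessionLike | null | undefined): string {
    const flavor = session?.metadata?.flavor
    const label = flavor ? FLAVOR_LABELS.get(flavor) : undefined
    return label ?? DEFAULT_AGENT_LABEL
}

// The label is an optional trailing parameter, so callers that do not
// pass one still get a neutral, non-Claude-specific string.
function formatPermissionRequest(
    toolName: string,
    agentLabel: string = DEFAULT_AGENT_LABEL,
): string {
    return `${agentLabel} is requesting permission to use ${toolName}`
}
```

Making the label optional with a neutral default means a formatter call that was missed during the threading work degrades to 'coding agent' rather than to a wrong product name.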

Why shared/src/voice.ts is not touched

The system prompt in voice.ts contains example text that references "Claude" (e.g. the permission request example: "Claude wants to run a bash command"). We considered updating this but deliberately chose not to.

The argument for changing it: be as accurate as possible even in the ConvAI prompt.

The argument against, which won: once this PR (and tiann#682) are merged, every context update ConvAI receives will carry the correct agent label — "Cursor is requesting permission to use Bash", not "Claude Code is requesting...". At that point ConvAI's spoken responses will naturally reflect the right terminology, because LLMs follow the context they receive rather than parroting example text from a system prompt verbatim. The voice.ts wording only remains a real-world problem if these formatter fixes are not in place — which is an argument for merging this PR, not for also editing the system prompt. Touching system prompt wording without that being strictly necessary felt like stepping into editorial territory that belongs to the maintainers.

Happy to follow up separately on voice.ts if desired once these changes land.

Test plan

  • bun test web/src/realtime/hooks/contextFormatters.test.ts (8 tests — all assert label appears, none hardcode Claude Code)
  • Manual: voice session active, speech input and response verified end-to-end
  • Manual: ConvAI response correctly reflected active session context

Issues

Fixes tiann#680

@heavygee heavygee force-pushed the fix/voice-flavor-labels branch 3 times, most recently from e68ef31 to 8179263 Compare May 24, 2026 23:41
Replaces hardcoded 'Claude Code' strings in voice context injections with
the active session's flavor label (Cursor, Codex, Gemini, etc.) via
getFlavorLabel() from @hapi/protocol. Falls back to 'coding agent' for
unknown or missing flavors.

Threads an agentLabel param through formatMessage, formatPermissionRequest,
formatReadyEvent, formatNewMessages, formatHistory, and formatSessionFull.
voiceHooks resolves the label once per call via session.metadata.flavor.

Fixes tiann#680

via [HAPI](https://hapi.run)

Co-Authored-By: HAPI <noreply@hapi.run>
@heavygee heavygee force-pushed the fix/voice-flavor-labels branch from 8179263 to 9402685 Compare May 25, 2026 09:49
heavygee added a commit that referenced this pull request Jun 5, 2026
The native-fallback probe previously returned true whenever FCM was
configured AND devices were registered, which suppressed web-push for
the namespace. The HAPI Bot correctly pointed out the gap: if the FCM
pipeline silently breaks (expired service-account key, sustained 5xx,
OAuth token-fetch failure, network blackhole) the operator gets nothing
on either channel until they manually intervene.

Approach (deliberate, not the bot's exact suggested fix):

- FcmService now keeps a small rolling window (last 8 outcomes) of send
  attempts and exposes `isHealthy()`. The threshold is 5+/8 failures =
  unhealthy; the buffer starts empty so a freshly-booted hub is
  optimistic ("innocent until proven guilty") and does not double-fire
  on event #1.
- Token-fetch failure (`getFcmAccessToken` throws) now records exactly
  one health-failure (not one per device), short-circuits the send
  loop, and returns a result so `sendToNamespace` no longer leaks the
  exception.
- `invalid` token responses are explicitly excluded from the health
  buffer because they are per-device facts (rotated/uninstalled token),
  not pipeline failures - FCM was reachable, it just rejected one
  stale token.
- `buildNativeFallbackProbe` now optionally accepts the FcmService and
  short-circuits to "let web-push fire" when health is bad, before it
  even queries the device registry. The single-arg call shape is still
  supported for back-compat.
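
The rolling-window rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the FcmService code: the `RollingHealth` class, the `SendOutcome` names, and `shouldSuppressWebPush` are hypothetical; only the window size (8), the threshold (5 failures), the empty-buffer optimism, and the exclusion of `invalid` outcomes come from the description.

```typescript
type SendOutcome = 'sent' | 'failed' | 'invalid'

class RollingHealth {
    private readonly windowSize = 8
    private readonly failureThreshold = 5
    // true = pipeline failure, false = successful send
    private readonly outcomes: boolean[] = []

    record(outcome: SendOutcome): void {
        // 'invalid' is a per-device fact (stale token), not a pipeline failure.
        if (outcome === 'invalid') return
        this.outcomes.push(outcome === 'failed')
        if (this.outcomes.length > this.windowSize) this.outcomes.shift()
    }

    isHealthy(): boolean {
        // An empty buffer has zero failures, so a fresh hub is optimistic.
        const failures = this.outcomes.filter(Boolean).length
        return failures < this.failureThreshold
    }
}

// Returns true when web-push should be suppressed for the namespace.
// The device lookup is a thunk so an unhealthy pipeline can short-circuit
// before the registry is queried; omitting `health` keeps the old behaviour.
function shouldSuppressWebPush(
    hasRegisteredDevices: () => boolean,
    health?: RollingHealth,
): boolean {
    if (health && !health.isHealthy()) return false
    return hasRegisteredDevices()
}
```

Because a single failure among recent successes never reaches the threshold, a one-off FCM timeout does not re-enable web-push, which is what keeps the duplicate-notification race closed.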

Why not the bot's exact suggestion ("invert: call FCM first, fall back
on result.sent === 0"):
- Couples PushNotificationChannel to FcmService and FcmSendPayload,
  reversing the clean parallel-channel architecture established earlier
  in this PR.
- Treats every transient single-event failure as fallback-worthy, which
  re-opens the duplicate-notification race that the suppression logic
  was added to close (FCM HTTP timeout that delivers later + the web
  push we sent in the meantime = two pings).
- A rolling health window only flips on sustained breakage, which is
  the actual operational scenario the bot is worried about.

The wrist-first design intent ("FCM fires unconditionally, web-push is
suppressed for the same namespace") documented in
docs/api/native-companion-contract.md is preserved on the happy path.
The probe only re-enables web-push when there is concrete evidence the
native pipeline is not delivering.

Tests:
- New FcmService.isHealthy suite covers empty-buffer, threshold flip,
  recovery as failures age out of the window, invalid-token exclusion,
  and network-error path.
- nativeFallbackProbe gains coverage for the unhealthy-but-registered,
  healthy-and-registered, and absent-fcmService (back-compat) cases.
- All 292 hub tests still pass; typecheck clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
heavygee commented Jun 6, 2026

Owner Author

Reopening as upstream PR against tiann/hapi; closing fork-tracking PR.

@heavygee heavygee closed this Jun 6, 2026
@heavygee heavygee deleted the fix/voice-flavor-labels branch June 6, 2026 11:58
heavygee added a commit that referenced this pull request Jun 10, 2026
- Web `CursorMigrationBanner` now renders a "Manual review needed"
  state for `cursorMigrationState === 'ambiguous'` (Major #1: caller
  was promoting the metadata flag but no UI surfaced it).
- Pin the md5-fixture contract for `workspaceHashFromPath`: raw,
  no-normalization, trailing-slash-distinct hashes computed via
  `printf '%s' <path> | md5sum` (Major #2: prevents algorithm drift
  that would silently revert path-priority discovery to fallback).
- Snapshot full candidate set BEFORE the canonical fast-path resolves
  a single drawer so the `migrator:transplanted` log reports the
  decision-time count, not a post-rm undercount (Minor #1).
- Warn log when canonical-path drawer is missing but readdir hands
  back exactly one candidate - regression-equivalent behaviour, but
  the size mismatch warrants a journalctl trail (path-normalization
  corner case the maintainer can grep for).
- Boundary test: `messageCount = 101` (first value above the skip
  threshold) engages the size sanity check, pinning the cutoff
  contract (Nit).
- Schema docstring on `cursorMigrationState` enum spelling out the
  banner contract per value (Nit).
- syncEngine `getHapiMessageCount` warn-logs `countMessages` throws
  instead of silently downgrading to 0 (would chronically disable
  the floor).

Drafted with claude-4.6-sonnet-thinking via Cursor; reviewed and
tested by the operator. tiann#873.

Co-authored-by: Cursor <cursoragent@cursor.com>
heavygee added a commit that referenced this pull request Jun 13, 2026
…dismissed (HAPI Bot, PR tiann#896)

The previous state machine swallowed the migration banner if the
operator reloaded the page before clicking dismiss: the migration flag
was set on success, and on remount the init logic mapped a
flag-set/dismiss-not-set session to 'pre-migrated', a state the banner
explicitly refuses to render. Net effect: a migrated session never
prompted for affirmative dismissal.

Fixes:

- Drop the 'pre-migrated' state. The dismissal flag is now the only
  signal that suppresses the banner; the migration flag alone means
  'banner shows until dismissed' (now or after a reload).
- Sessions that had nothing to migrate (no v1 entries in localStorage)
  pre-emptively write BOTH flags - migrated AND dismissed - so the bot's
  banner-stickiness fix doesn't surface a banner that has nothing to
  announce on freshly-created v2 sessions.
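
A rough sketch of the two-flag rule described in these fixes. The `MigrationFlags` object stands in for the localStorage flags, and `initMigration` and its parameters are hypothetical names; the real hook may track more states than the two named in this commit.

```typescript
type BannerStatus = 'completed' | 'dismissed'

// Stand-in for the two per-session localStorage flags.
interface MigrationFlags {
    migrated: boolean
    dismissed: boolean
}

function initMigration(
    flags: MigrationFlags,
    hasV1Entries: boolean,
    migrate: () => void,
): BannerStatus {
    // Only the dismissal flag suppresses the banner.
    if (flags.dismissed) return 'dismissed'
    // Migrated but never dismissed: keep showing the banner, even after a reload.
    if (flags.migrated) return 'completed'
    // Nothing to migrate: write BOTH flags so no empty banner appears.
    if (!hasV1Entries) {
        flags.migrated = true
        flags.dismissed = true
        return 'dismissed'
    }
    migrate()
    flags.migrated = true
    return 'completed'
}
```

With no 'pre-migrated' branch left, a remount can only land on 'completed' or 'dismissed', so a reload can no longer swallow the banner.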

Tests:

- New `reload-before-dismiss leaves the banner visible` test pins the
  fix end-to-end: mount #1 migrates -> 'completed', unmount, mount #2
  on the same session reads the localStorage flags and stays
  'completed'.
- New `opts fresh sessions out of the banner pre-emptively` test pins
  the no-v1-entries shortcut.
- Existing `does not re-migrate on a mount where the migrated flag is
  already set` updated to assert 'completed' (not the dropped
  'pre-migrated').
- Existing `skips migration when localStorage is empty` updated to
  assert the new 'dismissed' status + the banner-dismissed flag.
- Banner test for the 'pre-migrated -> nothing' case removed (the state
  no longer exists).

Co-authored-by: Cursor <cursoragent@cursor.com>
heavygee added a commit that referenced this pull request Jun 16, 2026
…l converter

Lets us rehydrate the chats that only exist as recovered_agent_chats/*.md
exports (no live Cursor jsonl), so the existing backfill-agent-transcript.ts
+ HAPI spawn flow can ingest them.

Format the converter handles:

  # Conversation <uuid>
  ---
  ### 1
  <user text>
  --- TOOL CALLS ---
  [TOOL:N] args: {...}
  ---
  ### 2
  <agent text>
  ...

Notes:
- Normalises CRLF -> LF first (specstory exports are Windows-line-ended).
- Splits on '\n(?=### \\d+\\s)' so trailing whitespace variants all hit.
- Collapses contiguous same-N blocks (export sometimes repeats '### 2'
  when interleaving agent text with tool results).
- Role rule: block #1 = user, then alternating user/assistant by parity.
  Approximate but acceptable: a handful of misattributed roles in
  scrollback is fine; losing ~800KB of recovered context isn't.
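
The notes above can be sketched as a small parser. This is an illustrative reconstruction, not the converter in this commit: `parseRecoveredChat` and the `Turn` shape are hypothetical, tool-call sections are left inside the turn text, and jsonl serialisation is omitted.

```typescript
interface Turn {
    index: number
    role: 'user' | 'assistant'
    text: string
}

function parseRecoveredChat(markdown: string): Turn[] {
    // Normalise CRLF -> LF first, since the exports use Windows line endings.
    const normalised = markdown.replace(/\r\n/g, '\n')
    // Split before each '### N' heading; the lookahead keeps the heading
    // attached to the chunk it introduces.
    const chunks = normalised.split(/\n(?=### \d+\s)/)
    const turns: Turn[] = []
    for (const chunk of chunks) {
        const match = /^### (\d+)\s*([\s\S]*)$/.exec(chunk)
        if (!match) continue // preamble such as '# Conversation <uuid>'
        const index = Number(match[1])
        // Drop a trailing '---' separator line, then surrounding whitespace.
        const text = match[2].replace(/(^|\n)---\s*$/, '').trim()
        const last = turns[turns.length - 1]
        if (last && last.index === index) {
            // Collapse contiguous blocks that repeat the same number.
            last.text = `${last.text}\n${text}`
            continue
        }
        // Parity rule: odd blocks are the user, even blocks the assistant.
        turns.push({ index, role: index % 2 === 1 ? 'user' : 'assistant', text })
    }
    return turns
}
```

The parity rule is the approximation the notes accept: it never needs to inspect the text, at the cost of occasionally mislabelling a turn.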

Smoke-tested against recovered_agent_chats/2025-11-13_system_voice_ef65e0f1.md
(785KB in -> 569KB jsonl, 216 turns, 108 user / 108 assistant) - first user
turn round-trips verbatim and the full transcript backfilled into HAPI
sqlite without errors.

Co-authored-by: Cursor <cursoragent@cursor.com>
heavygee added a commit that referenced this pull request Jun 17, 2026
heavygee added a commit that referenced this pull request Jun 17, 2026
heavygee pushed a commit that referenced this pull request Jun 18, 2026
* docs: spec for hapi-pi-agent-backend

* docs: spec retrospect for hapi-pi-agent-backend

* docs: plan for hapi-pi-agent-backend

* docs: plan retrospect for hapi-pi-agent-backend

* feat(pi): add hapi pi command with JSONL transport and event converter

- PiTransport: spawn pi --mode rpc, JSONL stdio, ENOENT/EPIPE handling
- PiEventConverter: Pi AgentEvent → HAPI AgentMessage conversion
- runPi: session lifecycle, dual-track event routing, model switching
- pi command: CLI registration with PI_PERMISSION_MODES
- Shared: add 'pi' to AGENT_FLAVORS, FLAVOR_CAPS, FLAVOR_LABELS

30 tests passing (15 transport + 15 converter)

* fix(pi): add Pi RPC types, fix double-cleanup/double-start/converter safety net

- Add cli/src/pi/types.ts with PiAgentEvent/PiResponseEvent discriminated unions
- PiTransport: constructor uses options object, double-start guard, drop log
- PiEventConverter: typed events via type assertions, top-level try/catch
- runPi: safeCleanup guard prevents double-cleanup race, sendAgentMessage
  for converted events, keepAlive() for session pings
- 33 tests passing

* docs: dev phase reviews and test results for hapi-pi-agent-backend

- Business logic review: pass (0 must_fix)
- Standards review: pass (0 must_fix)
- Taste review: P0 types issue fixed in code
- Robustness review v2: pass (v1 3 MUST_FIX all fixed)
- Integration review: pass (0 must_fix)
- Test results: 33 passing, all type errors resolved

* docs: taste review v2 pass after type definition fixes

* docs: dev retrospect for hapi-pi-agent-backend

* test: test execution for hapi-pi-agent-backend (20/20 pass)

* fix: add taste_review symlink for gate pattern match

* docs: test retrospect for hapi-pi-agent-backend

* fix(web): add pi to MODEL_OPTIONS Record type

* ci: PR and CI evidence for hapi-pi-agent-backend

* docs: overall retrospect for hapi-pi-agent-backend (all 5 phases)

* test(pi): add buffer split, missing fields, and handleResponse tests

- PiTransport: buffer cross-chunk reassembly test
- PiEventConverter: tool_execution_end with missing result/toolCallId
- handleResponse: 10 tests covering all branches (error, get_state,
  set_model, new_session, abort, prompt, unknown command)
- Extract handleResponse to accept onUpdate callback for testability
- Total: 46 tests passing (was 33)
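
For context, the cross-chunk reassembly that the new PiTransport test exercises can be sketched like this. `JsonlBuffer` is a hypothetical name, and silently dropping malformed lines is an assumption; the real transport reads from the child process's stdout and may log what it drops.

```typescript
class JsonlBuffer {
    // Holds the trailing partial line until the rest of it arrives.
    private pending = ''

    // Feed one stdout chunk; returns every complete JSON line parsed so far.
    push(chunk: string): unknown[] {
        this.pending += chunk
        const lines = this.pending.split('\n')
        // The last element is either '' (chunk ended on a newline) or an
        // incomplete line that must wait for the next chunk.
        this.pending = lines.pop() ?? ''
        const events: unknown[] = []
        for (const line of lines) {
            const trimmed = line.trim()
            if (trimmed === '') continue
            try {
                events.push(JSON.parse(trimmed))
            } catch {
                // Assumption: malformed lines are dropped rather than fatal.
            }
        }
        return events
    }
}
```

A JSON object split across two chunks is therefore emitted exactly once, after the chunk that carries its terminating newline.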

* fix(pi): set requiresRuntimeAssets to false — pi runs as subprocess, no native tools needed

* refactor(cli): lazy import ensureRuntimeAssets to reduce startup overhead

* docs: add 15 manual E2E protocol test cases (TC-4-xx) based on real Pi RPC capture

- TC-4-01 to TC-4-15: manual tests covering tool execution, thinking
  lifecycle, multi-turn, abort, error scenarios, model switch, cleanup
- Priority: P0 (tool fields, failure, thinking, multi-turn, abort)
  > P1 (basic conversation, write tool, model switch, usage) > P2 (edge cases)
- Includes actual Pi RPC event sequence from live capture as reference
- e2e-test-plan.md updated with test environment setup instructions
- Total test cases: 35 (6 unit + 14 integration + 15 manual)

* test: E2E protocol test results for hapi-pi (11/15 pass)

P0/P1 automated tests (9/9 pass):
- TC-4-01: Basic text conversation ✓
- TC-4-02: Tool read (field names verified) ✓
- TC-4-03: Tool write (file created) ✓
- TC-4-04: Tool failure (isError=true) ✓
- TC-4-05: Thinking lifecycle + usage ✓
- TC-4-06: Multi-turn context retention ✓
- TC-4-07: Abort generation ✓
- TC-4-14: Token count ✓
- TC-4-15: Extension UI events ignored ✓

P2 results:
- TC-4-10: Invalid token → 401 ✓
- TC-4-12: Ctrl+C cleanup, no orphans ✓
- TC-4-08: ENOENT (harness issue, exit code correct)
- TC-4-11: set_model not supported by Pi (success=false)
- TC-4-13: Pi crash (harness output capture issue)

* test: fix TC-4-11 result — Pi set_model works with correct provider/modelId

Previous test used invalid provider='' + modelId='deepseek-chat'.
Re-tested with provider='deepseek' + modelId='deepseek-v4-flash':
- set_model success=true
- model switched glm-5.1 → deepseek-v4-flash
- subsequent prompt confirmed working

Final E2E results: 12/15 PASS, 2 FAIL (test harness), 1 SKIP

* chore: remove .xyz-harness/ from git tracking, add to .gitignore

Local harness workflow artifacts should not be tracked in the repo.

* fix(pi): resolve web UI bugs for hapi-pi integration

Six bugs fixed for end-to-end pi session via hapi web UI:

1. runner buildCliArgs: add 'pi' branch to spawn correct command
   (was falling back to 'claude', launching wrong agent)
2. runPi: implement real keep-alive (2s interval) to prevent hub
   30s timeout marking session inactive
3. runPi: bump keep-alive to active state during agent/turn_start
4. sessionResume: add 'pi' to flavor switch and resume condition
   (was returning undefined, causing 'cannotResume' on inactive session)
5. PiEventConverter: emit codex-compatible {type:'message',message:...}
   /{type:'reasoning',message:...} with streamId; dedup by skipping
   text_start/text_end (only send deltas) to avoid triple-rendered text
6. PiTransport: fallback to stdout 'end' event when child process
   close event doesn't fire (bun spawn quirk)

Verified end-to-end: web UI shows pi reasoning + reply correctly,
session stays online, no duplicate text.

* fix(pi): address 4 web UI display bugs in hapi-pi integration

Four follow-up bugs were reported after the initial fix (6c28949); three of them are fixed here:

1. Stuck in 'queued' status — fix
   Pi's runner doesn't use MessageQueue2, so the base session's
   onBatchConsumed hook never fires. Add a FIFO of pending localIds
   in runPi and emit messages-consumed on agent_start. turn_start
   is intentionally skipped (it can fire multiple times per agent
   run after tool calls). A prompt rejection from Pi also consumes
   the localId so the next prompt isn't poisoned.

2. AI thinking only displays ':' — fix
   Pi emits pure incremental deltas (text_delta / thinking_delta)
   per token. The web reducer dedupes reasoning by streamId WITHIN
   one message's content array only — separate wire messages
   produce separate renders. Without accumulation, 50 deltas = 50
   reasoning renders, of which the reducer keeps only the last
   delta (a single character like ':').

3. Output text on separate lines — fix
   Same root cause as #2 but for text: the reducer appends each
   text AgentMessage as a new agent-text block (no dedup), so 50
   deltas become a 50-row character-by-character column.

4. Tool call execution status (in_progress -> completed)
   The tool result wire CodexMessage type is 'tool-call-result'
   (with callId + is_error?); the internal AgentMessage 'tool_result'
   is converted to that. Status mapping is preserved.

Implementation: extract a PiMessageAccumulator class (testable in
isolation) that mirrors codex's ReasoningProcessor pattern:
- message_start resets state and streamId
- text_delta / thinking_delta append to internal text / reasoning
- text_start/thinking_start/text_end/thinking_end ignored (they
  carry full partial state — would duplicate)
- message_end flushes (max 1 reasoning + 1 text message, in order)
- turn_end safety net flushes if active
- flushIfActive() exposed for transport close / crash
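
The accumulation rules listed above can be sketched as follows. This is a simplified illustration, not the PiMessageAccumulator in this commit: the event and output shapes, the `stream-N` id format, flushing reasoning before text, and ignoring deltas outside a message are all assumptions.

```typescript
type PiStreamEvent =
    | { type: 'message_start' | 'message_end' | 'turn_end' }
    | { type: 'text_start' | 'text_end' | 'thinking_start' | 'thinking_end' }
    | { type: 'text_delta'; delta: string }
    | { type: 'thinking_delta'; delta: string }

interface OutMessage {
    type: 'reasoning' | 'message'
    message: string
    streamId: string
}

class MessageAccumulator {
    private active = false
    private text = ''
    private reasoning = ''
    private streamId = ''
    private counter = 0

    handle(event: PiStreamEvent): OutMessage[] {
        switch (event.type) {
            case 'message_start':
                // Reset state and allocate a fresh stream id.
                this.active = true
                this.text = ''
                this.reasoning = ''
                this.streamId = `stream-${++this.counter}`
                return []
            case 'text_delta':
                if (this.active) this.text += event.delta
                return []
            case 'thinking_delta':
                if (this.active) this.reasoning += event.delta
                return []
            case 'message_end':
            case 'turn_end': // safety net if message_end never arrived
                return this.flushIfActive()
            default:
                // *_start / *_end carry full partial state and would duplicate.
                return []
        }
    }

    // Also callable directly on transport close or crash.
    flushIfActive(): OutMessage[] {
        if (!this.active) return []
        this.active = false
        const out: OutMessage[] = []
        if (this.reasoning) out.push({ type: 'reasoning', message: this.reasoning, streamId: this.streamId })
        if (this.text) out.push({ type: 'message', message: this.text, streamId: this.streamId })
        return out
    }
}
```

Fifty deltas therefore produce at most two wire messages per assistant message, instead of fifty single-character renders.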

The converter now routes AgentMessage through convertAgentMessage
so the wire format is codex-shaped (matches opencode/gemini/kimi
path). AgentMessage 'text' and CodexMessage 'message' both gain
optional id; convertAgentMessage preserves caller-provided id for
streamId-based dedup on the web side.

Tests: 16 new PiMessageAccumulator tests + 5 updated
PiEventConverter tests + 4 messageConverter tests, all passing.
Full suite: 909/910 (1 unrelated macOS path normalization). tsc
clean.

* fix(pi): review round 1 - 1 must-fix issue

The web session-resume helper referenced metadata.piSessionId, but the
shared MetadataSchema does not define the field, and the back-end has no
path to populate it (Pi session resume is out of scope per spec.md).
This caused web typecheck to fail and would also have produced a
runtime 'resume_unavailable' from the hub if a user tried to resume a Pi
session that had any user messages (the stale 'flavor === pi' branch in
inactiveSessionCanResume claimed resume was supported).

Revert the two early Pi branches from the web resume helper. Add a
comment pointing at the spec and noting what to undo when back-end
resume ships (re-add 'case pi' + 'piSessionId' on MetadataSchema +
extend hub resolveAgentResumeId).

* fix(pi): review round 2 - 4 must-fix issues

1. cli/src/runner/run.ts buildCliArgs: stop forwarding --resume to the pi
   binary. Pi session resume is out of scope (no piSessionId on
   Metadata), so forwarding would create an orphan session the hub can't
   track. Hub already returns null from resolveAgentResumeId for
   flavor='pi' and falls through to fresh spawn; this just hardens the
   runner layer to match.

2. cli/src/pi/runPi.ts: cache currentProvider from get_state and use it
   for subsequent set_model RPCs. Pi's set_model requires both provider
   and modelId, but the bootstrap-time code emitted provider: '' which
   Pi rejects. The bootstrap-time model is still applied by Pi at
   startup, so suppressing set_model until get_state arrives is a no-op
   for same-model configs rather than a wrong-model emit.

3. web/src/components/AssistantChat/modelOptions.ts: add explicit pi
   branches to getModelOptionsForFlavor and getNextModelForFlavor.
   Without them, Pi sessions fell through to the Claude preset cycler,
   which would push sonnet/opus ids into a Pi session via
   set-session-config. Mirrors the opencode handling introduced earlier.

Tests added/updated: buildCliArgs covers pi + claude resume; handleResponse
mirror test covers provider caching; modelOptions tests cover pi
no-fallback behavior for both option list and cycler.

* fix(pi): add session resume support and fix review issues

- Add piSessionId to MetadataSchema (shared/src/schemas.ts)
- Persist piSessionId from get_state response to metadata (cli/src/pi/runPi.ts)
- Pass --session-id to Pi spawn on resume (cli/src/pi/runPi.ts)
- Add pi branch to resolveAgentResumeId (hub/src/sync/syncEngine.ts)
- Add case 'pi' to resolveAgentSessionIdFromMetadata (web/src/lib/sessionResume.ts)
- Replace pi resume skip guard with --session-id forwarding (cli/src/runner/run.ts)
- Preserve piSessionId in pickExistingSessionMetadata (cli/src/agent/sessionFactory.ts)
- Add pi badge to AgentFlavorIcon (web/src/components/AgentFlavorIcon.tsx)
- Fix transport.onClose crash-marking on normal shutdown (cli/src/pi/runPi.ts)

* fix(pi): review round 1 - 3 must-fix issues

- resume.ts: add pi branch to dispatchLocalResume() so hapi resume
  dispatches to runPi instead of falling through to cursor
- runPi.ts: accept existingSessionId and use bootstrapExistingSession
  when resuming, matching other agents' pattern
- agentCommandOptions.ts: parse --session-id in addition to --resume
  so runner-spawned pi resume actually forwards the session ID
- types.ts: export PiPermissionMode alongside other agent permission
  mode types for consistent import convention

* fix(pi): review round 2 - 2 must-fix issues

* refactor(workflow): improve pi-adaptation-review-loop robustness

- Switch from structured output to file-based JSON output for reliability
- Replace per-round file limit (20→30) with clear wording (remove misleading split-commits instruction)
- Return { data, error } from readResultFile() to surface parse/validation failures in abortReason
- Fix lastMustFix sentinel: initialize to null, use ?? for explicit N/A reporting
- Add getAgentDirs() to dynamically discover agent dirs from cli/src/
- Document rollbackTo() atomic-round design intent
- Add isValidIssue() validation, runFinalCleanup() helper, git repo pre-check

* test(pi): add coverage for pi flavor across shared, cli, and web

- shared/flavors.test.ts: pi/kimi capability, label, known, supports
- shared/modes.test.ts: PI_PERMISSION_MODES contract, per-mode checks
  (7-mode allowed/denied matrix)
- web/AssistantChat/modelOptions.test.ts: pi shortcut vs Claude
  cycler, normalize filter (auto/default/whitespace), kimi/cursor/
  opencode cross-flavor consistency
- web/lib/sessionResume.test.ts: piSessionId resolver, cross-flavor
  stale-id protection, inactiveSessionCanResume for pi, regression
  coverage for all 6 other flavors
- web/components/AgentFlavorIcon.test.tsx: pi badge styling
  (bg-[#5b21b6]), Un fallback, case/whitespace normalize,
  className override
- cli/commands/agentCommandOptions.test.ts: --session-id
  (pi-specific flag), --resume alias, PI mode validation,
  --yolo vs explicit-mode priority

137 new test cases, all passing. Full suite: 96 files / 933 tests
green (unrelated apiMachine.test.ts macOS /private/var path issue
remains as documented in handoff).

* feat(pi): implement P0 — context budget bar + dynamic model discovery

P0-1: Context Budget Bar
- Add pi branch to modelConfig.ts getContextBudgetTokens()
- Conservative 200K default context window for Pi sessions

P0-2: CLI-side model discovery
- Add get_available_models to PiRpcCommand type
- Auto-send get_available_models after get_state in runPi.ts
- Cache model list and push to session metadata
- Register ListPiModels RPC handler with promise-based transport query

P0-3: Hub-side routing
- Add listPiModelsForSession to rpcGateway and syncEngine
- Add REST endpoint GET /sessions/:id/pi-models (pi sessions only)

P0-4: Web-side rendering
- Add PiModelSummary type to shared apiTypes
- Add usePiModels hook (TanStack Query, stale 60s)
- Add getSessionPiModels to API client
- Add sessionPiModels query key
- Wire piModelOptions into SessionChat availableModelOptions
- Model dropdown renders discovered models or falls back to Default

* fix(pi): address code review findings + pre-existing test issue

Review fixes:
- Fix race condition in sendPiRpcAndWait: use incremental id as key
  instead of command type, preventing resolver overwrite on concurrent
  calls (e.g. auto-discovery + ListPiModels RPC)
- Extract parsePiModels() to eliminate duplicated model parsing logic
  between handleResponse and ListPiModels RPC handler (DRY)
- Add resolvePendingRpc() call in error response path to prevent
  promise leaks when Pi rejects an RPC with an id
- Add piModelsState.error guard to onModelChange in SessionChat,
  matching the pattern used by codex and cursor flavors
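
The id-keyed resolver fix can be illustrated with a small registry. This sketch is not the sendPiRpcAndWait implementation: `PendingRpcs` and its method names are hypothetical, and sending the command over the transport and timeouts are left out.

```typescript
interface PendingEntry {
    command: string
    resolve: (value: unknown) => void
    reject: (error: Error) => void
}

class PendingRpcs {
    private nextId = 1
    private readonly pending = new Map<number, PendingEntry>()

    // Key by an incrementing id, not by command type, so two in-flight
    // calls of the same command cannot overwrite each other's resolver.
    register(command: string): { id: number; promise: Promise<unknown> } {
        const id = this.nextId++
        const promise = new Promise<unknown>((resolve, reject) => {
            this.pending.set(id, { command, resolve, reject })
        })
        return { id, promise }
    }

    // Called when a success response carrying this id arrives.
    resolve(id: number, value: unknown): boolean {
        const entry = this.pending.get(id)
        if (!entry) return false
        this.pending.delete(id)
        entry.resolve(value)
        return true
    }

    // Called on the error-response path so a rejected RPC does not
    // leave its promise pending forever.
    reject(id: number, message: string): boolean {
        const entry = this.pending.get(id)
        if (!entry) return false
        this.pending.delete(id)
        entry.reject(new Error(message))
        return true
    }

    get size(): number {
        return this.pending.size
    }
}
```

With command type as the key, the second of two concurrent calls would replace the first resolver and the first caller would never settle; distinct ids avoid that collision.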

Pre-existing fix:
- Fix apiMachine.test.ts symlink assertion on macOS (/var vs
  /private/var) by applying realpathSync to the expected path

* feat(pi): P1 — session rename sync, thinking level UI, skills/commands

P1-1: Session Rename → Pi notification
- Add set_session_name to PiRpcCommand
- Register RenamePiSession RPC handler in CLI
- Hub syncEngine.renameSession now forwards to Pi CLI for active sessions
- Hub rpcGateway + REST endpoint added

P1-2: Thinking Level support
- Add Pi thinking level constants to shared/src/piThinkingLevel.ts
  (off/minimal/low/medium/high/xhigh)
- Add ThinkingLevel capability to Pi flavor in flavors.ts
- sessionConfigRpc now supports effortMode for Pi thinking level
- runPi captures thinkingLevel from get_state and forwards via
  set_thinking_level
- Hub effort endpoint accepts pi sessions (was claude-only)
- Web: piThinkingLevelOptions.ts + HappyComposer renders Pi options
  when flavor=pi

P1-3: Skills/Commands discovery
- Add get_commands to PiRpcCommand, auto-discover after get_state
- Register ListPiCommands + ListSlashCommands RPC handlers in CLI
  (maps Pi commands to HAPI SlashCommand format)
- Hub: listPiCommandsForSession + REST GET /sessions/:id/pi-commands
- Web: usePiCommands hook + api client + query keys

Also fixes:
- Pre-existing ZodError.errors → ZodError.issues in hub/socket/server.ts
- Updated test expectation for effort endpoint error message

* feat(pi): implement P2 features — steer, queue modes, history, native images

P2-1: Steer/Follow-up
- Track piIsStreaming state from agent_start/turn_start/turn_end/agent_end
- When streaming, onUserMessage sends steer instead of prompt
- Added PiSteer/PiFollowUp RPC methods + hub routing + REST endpoints

P2-2: Queue modes
- Added set_steering_mode/set_follow_up_mode to PiRpcCommand
- CLI RPC handlers with mode state tracking
- Hub routing + REST POST endpoints
- Web API client methods

P2-3: History replay
- Added get_messages to PiRpcCommand
- CLI handler converts Pi AgentMessage to PiMessageEntry format
- Hub RPC routing + REST GET /sessions/:id/pi-messages
- Web usePiMessages hook + query key

P2-4: Native image passing
- Added PiImageContent type for base64 image data
- extractPiImages() helper reads attachment files as base64
- prompt/steer commands now include images field
- Falls back to @path text reference for non-image/unreadable files

* feat(pi): implement P3 advanced features — compact, fork, clone, switch, stats, export

P3 features for Pi agent integration:

- Compact: compact RPC with custom instructions, set_auto_compaction toggle
- Fork: fork at entry ID, get_fork_messages for fork context
- Clone: clone current Pi session
- Switch Session: switch Pi to a different session by path
- Session Stats: get token counts, message counts, cost
- HTML Export: export session as HTML file

All features follow existing P2 pattern:
- CLI: RPC handlers in runPi.ts with sendPiRpcAndWait
- Hub: rpcGateway + syncEngine routing + REST endpoints
- Web: API client methods + query keys + type exports + hooks (stats, fork messages)

Total: 8 new REST endpoints, 9 RPC handlers, 6 web API methods
Typecheck: all 3 packages pass (cli+hub+web)
Tests: 1155 pass (263 hub + 803 web + 89 shared), 0 failures

* refactor(pi): clean up runPi.ts imports and readability

- Replace require('fs') with top-level import { readFileSync } from 'fs'
- Extract handleGetState() as standalone function from handleResponse
  switch case (get_state case: 35 lines → 4 lines dispatch)

Typecheck: all 3 packages pass
Tests: 1066 pass (263 hub + 803 web), 0 failures

* fix(pi): remove native image passing, fix version pollution

- Remove extractPiImages helper and PiImageContent type: all
  attachments now use @path text references via
  formatMessageWithAttachments, consistent with every other agent
- Remove images field from prompt/steer/follow_up RPC commands
- Remove unused readFileSync import
- Restore cli/package.json version from test pollution
  (0.0.0-integration-test-should-be-auto-cleaned-up-51369 → 0.20.0)

Typecheck: all 3 packages pass
Tests: 1286 pass, 0 failures

* refactor(pi): extract hub helper, unify web hooks, fix import style

- Hub: extract withPiSession helper eliminating boilerplate across 15
  Pi REST endpoints (~400 lines → ~150 lines)
- Web: unify usePiForkMessages and usePiSessionStats to return
  destructured typed fields matching usePiModels/usePiCommands pattern
- Web: move 15 Pi response types from inline import() to top-level
  named imports in api/client.ts
- CLI: remove duplicate PiCommandSummary/PiCommandsResponse from
  types.ts, re-export from @hapi/protocol/apiTypes

Typecheck: all 3 packages pass
Tests: 1286 pass, 0 failures

* chore: untrack .agents/skills and .pi, fix .xyz-harness in gitignore

* refactor: remove unused text message id from converter layer, update gitignore

* fix: update tests for pi resume support and text id removal

* fix: restore cursor resume branch in buildCliArgs

* refactor: remove pi-specific rename from syncEngine, align with other agents

* refactor: remove effort field from sessionConfigRpc, Pi self-handles RPC

Pi agent now self-handles SetSessionConfig RPC (like Claude) using
the existing  field, instead of adding a parallel
field to the shared sessionConfigRpc helper which only knows about
.

- Remove effort/effortMode from sessionConfigRpc types and logic
- runPi.ts: self-register RPC handler with PiThinkingLevel validation
- Reuse resolveSessionConfigPermissionMode from sessionConfigRpc

* refactor: consolidate Pi RPC layer from 36 methods to 3 generics

rpcGateway: 12 methods → callPiRpc<T>
syncEngine: 12 passthroughs → callPiRpc<T> delegate
web client: 12 methods → callPiEndpoint<T>
routes: use engine.callPiRpc with RPC_METHODS constants
hooks: use callPiEndpoint, add missing type imports

* chore: revert unrelated apiMachine test change

* refactor: remove unused ThinkingLevel capability from flavors

Pi's thinking level is an effort variant, not a separate capability.
The ThinkingLevel constant and supportsThinkingLevel() had zero callers
— the frontend uses flavor-based branching for effort option rendering.

* refactor: drop Pi prefix from generic RPC method names

* refactor: remove 13 Pi RPC methods with no UI consumers

Steer: already handled by onUserMessage auto-routing
Follow-up: redundant with HAPI message queue
ListPiCommands/GetMessages/ForkMessages/SessionStats: no UI
Compact/SetAutoCompaction/Fork/Clone/SwitchSession/ExportHtml: no UI
SetSteeringMode/SetFollowUpMode: no UI

Kept: ListPiModels (has UI), SetSessionConfig, ListSlashCommands, Abort, Switch
Deleted: 4 web hooks, 13 RPC handlers, 12 REST routes, 13 rpcMethods entries
Net: -730 lines

* refactor: extract session.ts and loop.ts from runPi.ts

Restructure Pi agent following Codex pattern (without Local/Remote
splitting since Pi only has remote mode):

- session.ts: PiSession class managing state + hub communication
- loop.ts: response parsing, RPC resolver, transport event wiring
- runPi.ts: thin entry (bootstrap, RPC handlers, lifecycle)

Changes from review:
- Encapsulate RPC resolver in PiRpcResolver class (session-scoped,
  not module-level singleton)
- Remove unused extractTextFromPiMessage export
- Fix inline import('./types') → top-level import

* refactor: normalize Pi file naming and improve test coverage

- Rename PiTransport.ts → piTransport.ts, PiEventConverter.ts →
  piEventConverter.ts, PiMessageAccumulator.ts → piMessageAccumulator.ts
  (match project-wide camelCase convention)
- Delete handleResponse.test.ts (tested stale copy of inline function)
- Add loop.test.ts with 20 tests covering parsePiModels,
  parsePiCommands, wireTransportEvents integration, and sendPiRpcAndWait
- Total Pi tests: 73 (was 53)

* test: add E2E harness with 4 core helpers and integration specs

Helper functions in e2e/harness.ts capture the four non-obvious
interactions discovered during the 2026-06-09 retest:
- longPress: SessionActionMenu is triggered by 500ms press, not click
- mockOffline: useOnlineStatus hook listens to navigator.onLine +
  window offline event, not CDP Network.emulateNetworkConditions
- pollForText: thinking indicator flickers in <1s, 3s polling misses
- isVisible: element.offsetParent returns null for position:fixed
  dialogs even when visible; use getBoundingClientRect

Plus Chrome lifecycle (startChrome/stopChrome, never pkill chrome)
and hub API helpers (loginWithToken, listSessions).

5 integration specs (e2e/integration/) cover:
- yolo-permission: toggle + localStorage persistence (4 cases)
- codex-dialog: pre-flight check + dialog render (3 cases)
- stress: 10 concurrent + invalid JWT + malformed + unknown
  endpoint (5 cases, all PASS)

All 12 integration cases pass. Full E2E results in
.xyz-harness/2026-06-09-full-e2e-retest/ (67 cases, 0 functional
bugs found).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: resolve Pi model selection and thinking level issues

- Fix PiModelPanel: use provider+modelId composite for selection check
  and React key, preventing duplicate highlights for same-name models
  across different providers
- Fix PiThinkingLevelPanel: unify thinkingLevelMap filtering logic by
  extracting shared isThinkingLevelSupported utility
- Fix HappyComposer: auto-reset effort to highest supported level when
  switching models, update label to reflect effective level
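
A minimal sketch of the provider+modelId composite check from the first bullet. The `PiModel` shape and the helper names `piModelKey` and `isSelectedPiModel` are assumptions for illustration, not the actual PiModelPanel code.

```typescript
// Hypothetical model shape; the real Pi model type may carry more fields.
interface PiModel {
    provider: string;
    modelId: string;
}

// Composite key: unique per provider even when two providers share a modelId.
// Serializing a tuple avoids ambiguity from separator characters inside the ids.
function piModelKey(model: PiModel): string {
    return JSON.stringify([model.provider, model.modelId]);
}

// Selection check that compares the composite key instead of modelId alone.
function isSelectedPiModel(candidate: PiModel, selected: PiModel | null): boolean {
    return selected !== null && piModelKey(candidate) === piModelKey(selected);
}

const panelModels: PiModel[] = [
    { provider: 'provider-a', modelId: 'shared-model' },
    { provider: 'provider-b', modelId: 'shared-model' },
];
const panelSelection: PiModel = { provider: 'provider-b', modelId: 'shared-model' };

// Only the provider-b row is highlighted, even though both rows share a modelId.
const highlightedRows = panelModels.filter((m) => isSelectedPiModel(m, panelSelection));
```

The same key can serve as the React key, so two same-name models never collide in the list.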

* refactor: remove 29 dead exports from feat-pi-support

Remove unused types, methods, and re-exports identified by dead code audit:

shared/src/apiTypes.ts (19):
- SessionModelIdentifier, ListPiCommandsResponse
- PiSteeringMode, PiFollowUpMode, PiSteerResponse, PiFollowUpResponse
- PiQueueModeResponse, PiMessageEntry, PiMessagesResponse
- PiCompactResponse, PiSetAutoCompactionResponse
- PiForkResponse, PiForkMessageEntry, PiForkMessagesResponse
- PiCloneResponse, PiSwitchSessionResponse
- PiSessionStats, PiSessionStatsResponse, PiExportHtmlResponse

cli/src/pi/types.ts (6):
- PiSessionStats, PiCompactionResult, PiForkMessageEntry (dead local duplicates)
- PiCommandsResponse, PI_THINKING_LEVELS, PI_THINKING_LEVEL_LABELS (dead re-exports)

cli/src/pi/piMessageAccumulator.ts (1):
- flushIfActive() method (comment claimed runPi calls it, but it doesn't)

cli/src/pi/piTransport.ts (1):
- isRunning() method (never called in production code)

web/ (2):
- ProviderGroup, PiThinkingLevelOption (unnecessary exports, made local)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix: resolve 7 PR review issues in Pi support

#3 Remove duplicated PI_THINKING_LEVELS in schemas.ts, import from @hapi/protocol
#2 Add piAvailableModels field to MetadataSchema (schema-runtime consistency)
#6 Replace hardcoded flavor names with supportsEffort() in effort route
#1 Move PiRpcResolver from module-level singleton to PiSession instance
#4 Add piCachedModels fallback in piModelOptions useMemo
#7 Merge message_update dead branch into unified not-converted case
#10 Fix misleading Pi model list comments in modelOptions.ts

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix: normalize Pi model object to string in hub sessionCache (#5), remove extra blank line in rpcGateway (#8)

#5: applySessionConfig now extracts modelId from { provider, modelId }
    before passing to setSessionModel / session.model, preventing
    [object Object] from being stored in SQLite when Pi switches models.

#8: Remove double blank line before RpcGateway class declaration.
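
The #5 normalization could look roughly like this. `SessionModelInput` and `normalizeSessionModel` are hypothetical names; only the `{ provider, modelId }` shape comes from the description above.

```typescript
// Hypothetical union: Pi reports an object, other agents report a plain string.
type SessionModelInput = string | { provider: string; modelId: string } | null;

// Reduce either form to the plain string stored in session.model.
function normalizeSessionModel(model: SessionModelInput): string | null {
    if (model === null) return null;
    return typeof model === 'string' ? model : model.modelId;
}

// Without normalization, string coercion of the object form yields "[object Object]".
const coercedWithoutFix = String({ provider: 'provider-a', modelId: 'model-x' });
const normalizedWithFix = normalizeSessionModel({ provider: 'provider-a', modelId: 'model-x' });
```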

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(pi): preserve piAvailableModels on resume, document SetSessionConfig divergence

- sessionFactory: preserve piAvailableModels in pickExistingSessionMetadata
  so web shows cached models on inactive-session view without RPC round-trip
- sessionConfigRpc: extend resolveNullableSessionModel to accept
  {provider, modelId} object form for schema consistency
- runPi: document why Pi manually registers SetSessionConfig instead of
  reusing registerSessionConfigRpc (wire protocol needs separate fields)
- package.json: restore version to 0.20.0

* refactor: remove unused Pi types, extract JsonLineParser, clean up review findings

- Remove 13 unused PiRpcCommand variants and PiStreamingBehavior type (YAGNI)
- Remove unnecessary exports on 3 internal Zod schemas in pi/schemas.ts
- Extract JsonLineParser base class to utils/, shared by PiTransport,
  CodexAppServerClient, and AcpStdioTransport (eliminates 3x duplicate
  handleStdout buffer logic)
- Remove DEV-only duplicate session ID detection from SessionList.tsx
  (debug code unrelated to Pi support scope)
- Add comments explaining key prefix rationale in SessionChat.tsx
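
For readers unfamiliar with the duplicated buffer logic, a newline-delimited JSON parser can be sketched as below. This is an illustration under assumptions: the real JsonLineParser is described as a base class, and its constructor and method names are not given in this message, so `push`, `onMessage`, and `onParseError` are hypothetical.

```typescript
// Sketch of a newline-delimited JSON buffer; member names are illustrative.
class JsonLineParser {
    private buffer = '';
    private readonly onMessage: (message: unknown) => void;
    private readonly onParseError: (line: string) => void;

    constructor(
        onMessage: (message: unknown) => void,
        onParseError: (line: string) => void = () => {}
    ) {
        this.onMessage = onMessage;
        this.onParseError = onParseError;
    }

    // Feed a raw stdout chunk: complete lines are parsed, the partial tail is kept.
    push(chunk: string): void {
        this.buffer += chunk;
        const lines = this.buffer.split('\n');
        // The last element is an incomplete line (or '' when the chunk ended in \n).
        this.buffer = lines.pop() ?? '';
        for (const line of lines) {
            const trimmed = line.trim();
            if (trimmed === '') continue;
            let parsed: unknown;
            try {
                parsed = JSON.parse(trimmed);
            } catch {
                this.onParseError(trimmed);
                continue;
            }
            this.onMessage(parsed);
        }
    }
}

const receivedMessages: unknown[] = [];
const lineParser = new JsonLineParser((m) => receivedMessages.push(m));
lineParser.push('{"type":"a"}\n{"ty'); // second message is split across chunks
lineParser.push('pe":"b"}\n');
```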

* chore: remove unrelated E2E test harness from Pi support PR

E2E harness (codex-dialog, stress, yolo-permission, scratchlist specs)
was introduced in this branch but tests generic HAPI behavior unrelated
to Pi agent support. Should live in a separate PR.

* fix: wrap cursor model change handler for union type compatibility

* fix: apply startup --model to Pi and remove duplicate lockfile entry

1. --model startup bug:
   - Add initialModel to PiSession to preserve startup model
   - handleGetState preserves initialModel instead of overwriting with Pi default
   - get_available_models handler resolves provider from cached models and sends set_model

2. bun.lock duplicate key:
   - Remove duplicate @twsxtd/hapi-win32-x64@0.20.0 entry
   - Fixes CI lockfile regeneration that caused hono type errors

* fix: update test expectation for effort endpoint error message

* fix(pi): resolve 8 link-review defects + abort session termination

- W1C-D-1: hasSameAgentSessionIds missing piSessionId/kimiSessionId
  + extractAgentSessionId also needs piSessionId recognition
- D-1: dispatchLocalResume pi branch missing effort param
- W1B-1-01: buildCliArgs only passes --effort for claude, not pi
- W2B-D-2: effort=null does not send set_thinking_level to Pi
- D-3: turn_start does not consume pendingLocalIds
- D-7: keep_alive falls into default case in convertPiEvent
- D-9: finally overwrites sessionEndReason set by Switch/Abort
- W2B-D-3: ListPiModels RPC does not update metadata
- Abort handler: remove cleanupAndExit, only cancel current turn

Also: Switch handler returns { success: true } for consistency

Test coverage: 13 new test cases across 5 files

* fix: restore cli version from integration test placeholder

* fix(pi): send restored thinking level to Pi subprocess on startup

opts.effort was stored in piSession.currentThinkingLevel but never
forwarded via set_thinking_level during the startup sequence, causing
runner-spawned and resumed sessions to show the restored effort in
HAPI while Pi kept its default.

* fix: restore cli package version from integration test residue

* fix(pi): switch-to-remote handler preserves session instead of terminating

Replace lifecycle.cleanupAndExit() with createModeChangeHandler + keepAlive
in the Switch RPC handler. Pi runs as a single long-lived subprocess
without BaseLocalLauncher's restart loop, so cleanupAndExit() permanently
destroyed the session on mode switch. The web handoff button now correctly
changes control mode while keeping Pi alive.

* fix(pi): remove permission mode selector (Pi RPC has no runtime switching)

Pi's --mode rpc is non-interactive and auto-approves all tool execution;
there is no set_permission_mode command in the protocol. The selector
reported success without changing Pi's behavior, misleading users.

Remove the concept across all four packages:
- shared: getPermissionModesForFlavor('pi') returns [] (cascades to
  hub 400 + web UI auto-hide via length===0 guards); drop
  PI_PERMISSION_MODES / PiPermissionMode
- cli: strip permissionMode from PiSession/runPi/pi command/resume;
  drop the no-op SetSessionConfig permission branch that stored state
  without forwarding to the subprocess
- web: delete PiPermissionPanel.tsx; remove panel block + imports
  from HappyComposer

* fix(cli): realpath workspace root in apiMachine test assertion

The handler realpaths the cwd as a symlink-escape guard, so on macOS
/var/folders/... resolves to /private/var/folders/... The test compared
against the un-resolved path and failed on macOS. Use realpathSync on
the expected value for cross-platform consistency (no-op on Linux where
/tmp has no symlink prefix).

* fix(pi): keepalive reads current mode instead of constructor-time startingMode

The Switch handler updated controlledByUser but PiSession.pushKeepAlive()
still emitted the readonly startingMode every 2s, so a runner-started
session switched to local would flip back to remote on the next keepalive.

Replace readonly startingMode with a mutable mode field; add setMode()
that updates it and re-pushes keepAlive immediately. The Switch RPC
handler now calls setMode() before handleModeChange.
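
A minimal sketch of the mutable-mode keepalive. `ModeTracker` and its `emit` callback are stand-ins for PiSession and its hub transport, and the timer that would call `pushKeepAlive` every 2s is left out.

```typescript
type ControlMode = 'local' | 'remote';

// Stand-in for the part of PiSession that reports the control mode.
class ModeTracker {
    private mode: ControlMode;
    private readonly emit: (mode: ControlMode) => void;

    constructor(startingMode: ControlMode, emit: (mode: ControlMode) => void) {
        this.mode = startingMode;
        this.emit = emit;
    }

    // Called on each heartbeat: reports the current mode, not the starting one.
    pushKeepAlive(): void {
        this.emit(this.mode);
    }

    // Update the mode and re-push at once so the hub does not wait for the next tick.
    setMode(mode: ControlMode): void {
        this.mode = mode;
        this.pushKeepAlive();
    }
}

const emittedModes: ControlMode[] = [];
const modeTracker = new ModeTracker('remote', (m) => emittedModes.push(m));
modeTracker.pushKeepAlive();  // heartbeat before the switch
modeTracker.setMode('local'); // switch triggers an immediate re-push
modeTracker.pushKeepAlive();  // next heartbeat keeps the switched mode
```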

* fix(pi): runner no longer passes permission flags to Pi subprocess

After removing the Pi permission selector, the Pi command parser rejects
--permission-mode and ignores --yolo. But the shared buildCliArgs tail in
the runner still appended these flags for Pi sessions, making runner-
spawned Pi children exit before registering a session.

Guard the permission/yolo append with agent !== 'pi'.
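
Roughly, the guard looks like the sketch below. `SpawnOptions` and `buildPermissionArgs` are hypothetical; the real buildCliArgs takes more inputs and builds more than this tail.

```typescript
// Hypothetical option shape for illustration only.
interface SpawnOptions {
    agent: string;
    permissionMode?: string;
    yolo?: boolean;
}

// Builds only the permission-related tail of the argument list.
function buildPermissionArgs(options: SpawnOptions): string[] {
    const args: string[] = [];
    // Pi's command parser rejects --permission-mode, so skip the whole tail for Pi.
    if (options.agent !== 'pi') {
        if (options.permissionMode) args.push('--permission-mode', options.permissionMode);
        if (options.yolo) args.push('--yolo');
    }
    return args;
}

// 'default' is a placeholder value for the permission mode.
const piTail = buildPermissionArgs({ agent: 'pi', permissionMode: 'default', yolo: true });
const claudeTail = buildPermissionArgs({ agent: 'claude', permissionMode: 'default', yolo: true });
```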

* fix(pi): preserve provider identity when persisting selected Pi model

The hub's applySessionConfig normalized Pi's { provider, modelId } object
down to a plain modelId string for the shared session.model field, losing
the provider. On reload or next render, web's selectedPiModel lookup
matched by modelId alone — if two providers share a modelId, the wrong
one was highlighted, and subsequent model/thinking-level changes sent the
wrong provider to the Pi subprocess.

Add a provider-qualified piSelectedModel field to session metadata
(schema + persistPiSelectedModel mirroring persistPreferredPermissionMode).
Web's selectedPiModel now prefers the provider-qualified match and only
falls back to modelId-only matching when absent.
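
The lookup order can be sketched as follows. `resolveSelectedPiModel` and `PiModelRef` are illustrative names, and falling back when the provider-qualified entry is stale (not just absent) is an assumption of this sketch.

```typescript
interface PiModelRef {
    provider: string;
    modelId: string;
}

function resolveSelectedPiModel(
    available: PiModelRef[],
    sessionModel: string | null,
    piSelectedModel: PiModelRef | undefined
): PiModelRef | undefined {
    if (piSelectedModel) {
        // Preferred path: match on provider and modelId together.
        const exact = available.find(
            (m) => m.provider === piSelectedModel.provider && m.modelId === piSelectedModel.modelId
        );
        if (exact) return exact;
    }
    // Fallback: modelId-only matching, which is ambiguous for duplicate modelIds.
    return available.find((m) => m.modelId === sessionModel);
}

const availablePiModels: PiModelRef[] = [
    { provider: 'provider-a', modelId: 'shared-model' },
    { provider: 'provider-b', modelId: 'shared-model' },
];
const qualifiedMatch = resolveSelectedPiModel(availablePiModels, 'shared-model', {
    provider: 'provider-b',
    modelId: 'shared-model',
});
const fallbackMatch = resolveSelectedPiModel(availablePiModels, 'shared-model', undefined);
```

With the metadata present the provider-b row is chosen; without it the first modelId match wins, which is the ambiguity this commit removes.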

* fix(pi): model picker checkmark follows provider-qualified selection

selectedPiModel already resolves via provider+modelId, but the model
panel's currentPiModel still matched by modelId alone — so with two
providers sharing a modelId the checkmark pointed at the wrong row.
Reuse selectedPiModel directly.

* fix(pi): steer messages consumed immediately, not queued in pendingLocalIds

onUserMessage unconditionally pushed localId into pendingLocalIds, but a
steer (sent while piIsStreaming) does not start a new turn — so the
steer's localId was never drained by turn_start. The next normal prompt's
turn_start would consume the stale steer localId instead, leaving the
new prompt's bubble stuck in the queued bar.

Only queue localId for the prompt path. Steer path emits
messages-consumed immediately.

* fix(pi): clear stale thinking level when switching to non-reasoning model

The model-change effect early-returned when selectedPiModel.reasoning ===
false, leaving the previously-set effort (e.g. 'high') persisted on the
session. The UI hid the thinking picker for the non-reasoning model, but
the hub still forwarded the stale effort as set_thinking_level — with no
visible control to clear it.

Call onEffortChange(null) for non-reasoning models.
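
In sketch form, with `syncEffortOnModelChange` as a hypothetical stand-in for the model-change effect. The check that skips the call when effort is already null is an addition of this sketch.

```typescript
type PiThinkingLevel = 'off' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';

function syncEffortOnModelChange(
    model: { reasoning: boolean },
    currentEffort: PiThinkingLevel | null,
    onEffortChange: (effort: PiThinkingLevel | null) => void
): void {
    if (!model.reasoning) {
        // Previously an early return here left the stale effort on the session.
        if (currentEffort !== null) onEffortChange(null);
        return;
    }
    // Reasoning models: the supported-level handling is omitted from this sketch.
}

const effortCalls: Array<PiThinkingLevel | null> = [];
syncEffortOnModelChange({ reasoning: false }, 'high', (e) => effortCalls.push(e));
syncEffortOnModelChange({ reasoning: true }, 'high', (e) => effortCalls.push(e));
```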

* fix(pi): return provider-qualified model in SetSessionConfig applied

The CLI handler returned only currentModel (bare string), so the hub's
applySessionConfig saw a non-object model and cleared
metadata.piSelectedModel via persistPiSelectedModel(session, null) —
undoing the provider that was just stored on the inbound config.

Return { provider, modelId } when both are known so the hub keeps the
provider-qualified metadata intact across active model changes.

* fix(pi): preserve piSelectedModel in bootstrapExistingSession metadata

The metadata whitelist rebuild kept piAvailableModels but omitted
piSelectedModel, so the first resume/local-handoff update dropped the
provider identity — after which web fell back to modelId-only matching
and could select the wrong provider for duplicate modelIds.

* fix(pi): await Pi confirmation before reporting model/effort applied

SetSessionConfig was fire-and-forget — transport.send wrote JSONL to
stdin and returned immediately. If Pi rejected an invalid provider/model
or thinking level, the hub still persisted the new value and the UI
reported success while Pi kept the old runtime state.

Use sendPiRpcAndWait so a failed set_model/set_thinking_level rejects
the web request and leaves the session config unchanged.

* fix(pi): resolve set_model RPC so awaited model switch does not time out

SetSessionConfig awaits sendPiRpcAndWait(set_model) before reporting the
model applied, but handleResponse's set_model branch updated state and
fell through without calling resolvePendingRpc. The pending RPC promise
then waited the full 10s timeout and rejected, making /sessions/:id/model
return 409 even though Pi accepted the change. Mirror every other branch
by resolving the pending RPC after updating currentModel/currentProvider.
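
The await/resolve symmetry can be illustrated with a small pending-RPC registry. This is not the actual PiRpcResolver: `PendingRpcs`, `wait`, and `settle` are hypothetical names, and only the 10s timeout figure comes from the description above.

```typescript
type RpcResult = { ok: true; data: unknown } | { ok: false; error: string };

interface PendingEntry {
    resolve: (value: unknown) => void;
    reject: (error: Error) => void;
    timer: ReturnType<typeof setTimeout>;
}

class PendingRpcs {
    private readonly pending = new Map<string, PendingEntry>();

    // Returns a promise that rejects if no response settles it in time.
    wait(id: string, timeoutMs: number): Promise<unknown> {
        return new Promise((resolve, reject) => {
            const timer = setTimeout(() => {
                this.pending.delete(id);
                reject(new Error(`RPC ${id} timed out after ${timeoutMs}ms`));
            }, timeoutMs);
            this.pending.set(id, { resolve, reject, timer });
        });
    }

    // Every response branch has to end up here, or the caller waits out the timeout.
    settle(id: string, result: RpcResult): boolean {
        const entry = this.pending.get(id);
        if (!entry) return false;
        clearTimeout(entry.timer);
        this.pending.delete(id);
        if (result.ok) entry.resolve(result.data);
        else entry.reject(new Error(result.error));
        return true;
    }
}

const rpcRegistry = new PendingRpcs();
const pendingSetModel = rpcRegistry.wait('set_model-1', 10000);
// The set_model response branch updates state, then settles the waiter.
const settledFirst = rpcRegistry.settle('set_model-1', { ok: true, data: 'applied' });
const settledAgain = rpcRegistry.settle('set_model-1', { ok: true, data: 'applied' });
```

A branch that updates state but never calls the settle step leaves `pendingSetModel` hanging until the timer fires, which is the failure mode fixed here.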

* fix(pi): drain pending localId on turn_start only; throw when set_model suppressed

- loop.ts: split agent_start/turn_start branches. Pi emits both per prompt;
  draining on both popped the FIFO twice and shipped an undefined localId to
  the hub. agent_start now only sets thinking state; turn_start drains.
- runPi.ts: when set_model is suppressed (provider unknown), throw instead of
  silently returning applied, so the hub returns 409 rather than persisting a
  piSelectedModel Pi never received.
- loop.test.ts: assert agent_start does not drain; add regression test that a
  single turn drains exactly one real localId.
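
A sketch of the split branches, assuming a simple FIFO of local ids. `LocalIdQueue` is a hypothetical stand-in for the loop.ts state, and skipping an empty queue is a choice made by this sketch.

```typescript
type PiLifecycleEvent = 'agent_start' | 'turn_start';

class LocalIdQueue {
    private readonly pendingLocalIds: string[] = [];
    readonly consumed: string[] = [];
    thinking = false;

    enqueuePrompt(localId: string): void {
        this.pendingLocalIds.push(localId);
    }

    handle(event: PiLifecycleEvent): void {
        if (event === 'agent_start') {
            // State only: draining here as well would pop the FIFO twice per prompt.
            this.thinking = true;
            return;
        }
        // turn_start: drain exactly one pending localId, if any.
        const localId = this.pendingLocalIds.shift();
        if (localId !== undefined) this.consumed.push(localId);
    }
}

const localIdQueue = new LocalIdQueue();
localIdQueue.enqueuePrompt('local-1');
localIdQueue.handle('agent_start'); // Pi emits both events for a single prompt
localIdQueue.handle('turn_start');
```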

* fix(pi): exclude Pi from generic Ctrl/Cmd+M model cycler

SessionChat fed piModelOptions into HappyComposer.availableModelOptions,
so the global Ctrl/Cmd+M shortcut ran getNextModelForFlavor over the Pi
list and called onModelChange with a bare modelId string. Pi needs
{ provider, modelId } to disambiguate duplicate model IDs across
providers; a bare string made runPi fall back to the first cached
provider match (wrong provider) or throw when the provider was unknown.

Drop the piModelOptions useMemo and pass undefined for Pi, mirroring
modelOptions.ts where the Pi branch already returns the current model
unchanged (no-op) when no custom options are supplied. Pi model changes
now go only through the dedicated provider-qualified picker (piModels).

* fix(pi): commit PiSession config only after Pi confirms the RPC

SetSessionConfig previously mutated piSession.currentModel /
currentProvider / currentThinkingLevel BEFORE awaiting
sendPiRpcAndWait(set_model / set_thinking_level). When Pi rejected the
value or the RPC timed out, the handler threw and the route returned
409, but PiSession kept the unconfirmed values; the 2s keepalive then
reported them back to the hub, where handleSessionAlive persisted a
model/effort Pi never accepted.

Resolve the requested model/effort into locals first, send the RPCs,
and only commit to PiSession after each await resolves. The null
(clear-model) path needs no RPC so it still commits immediately; the
unknown-provider path still throws without committing.

* fix(pi): apply startup model only after Pi confirms set_model

Two startup paths persisted the requested --model before Pi confirmed it:

1. handleGetState set session.currentModel = session.initialModel as soon
   as get_state returned, using the unconfirmed startup model instead of
   Pi's actual default. If the model was unavailable or rejected, the 2s
   keepAlive reported it to the hub, which persisted/showed a model Pi
   never accepted.

2. get_available_models then sent set_model fire-and-forget, so a Pi
   rejection was never observed and currentModel stayed on the bad value.

Fix: handleGetState now reports Pi's real current model (newModel) while
a startup model is merely requested. get_available_models resolves the
provider from the cached list, awaits set_model, and commits
currentModel/currentProvider only on success — on rejection it logs and
keeps Pi's default. The await is fired detached so the
get_available_models RPC itself still resolves for ListPiModels.

* fix(pi): do not persist startup model before Pi confirms set_model

The startup --model still reached the hub unconfirmed via two paths the
previous Fix #13 left open:

1. bootstrapSession({ model: opts.model }) seeded the hub session model
   at creation time, and SessionCache.handleSessionAlive persists every
   non-undefined keepAlive model — so an unavailable/rejected model was
   stored and shown before get_available_models/set_model ran.
2. PiSession constructor set this.currentModel = opts.model, so the very
   first keepAlive (sent by startKeepAlive before any RPC confirms the
   model) reported the unconfirmed value.

Pass model: undefined to bootstrapSession and start PiSession.currentModel
at null; opts.model is still captured as initialModel and applied/committed
only after get_available_models confirms it exists and set_model succeeds
(Fix #13). The hub now sees Pi's real current model from the first
get_state keepAlive and switches to the requested model only once accepted.

Also add sendPiRpcAndWait contract tests pinning the await<->resolve
symmetry (Fix #10): set_model/set_thinking_level/get_available_models must
resolve before timeout on a success response, and reject on a Pi error.

* fix(pi): apply startup effort only after Pi confirms set_thinking_level

runPi restored opts.effort straight into piSession.currentThinkingLevel
before startKeepAlive ran, and pushKeepAlive persists effort — so a
resumed/runner-spawned session could store/show a thinking level Pi
rejected or ignored. This is the effort analog of the startup-model
confirmation contract (Fix #13/#14).

Capture the requested effort into a local startupThinkingLevel instead of
mutating currentThinkingLevel up front. After transport.start() and the
get_state/get_available_models/get_commands sends, await set_thinking_level
and commit currentThinkingLevel + push a keepAlive only on success; on
rejection keep Pi's default (already reported by get_state). The await is
detached so the run loop is not blocked, and get_state is sent before the
set so its authoritative baseline lands first and cannot clobber the
confirmed value.

* fix(pi): omit unknown runtime config from keepalive, don't clear persisted state

Fix #14 changed PiSession.currentModel to start at null so the startup
--model was not leaked before confirmation. But the hub treats keepAlive
model:null as an explicit clear (sessionCache.ts only skips when the
field is undefined), so the first heartbeat (startKeepAlive runs before
get_state) now erased a resumed Pi session's persisted model/effort
before Pi reported its real state.

Distinguish "unknown" from "clear": currentModel/currentThinkingLevel
start undefined and keepAlive omits undefined fields (via
getKeepAliveRuntime), so the hub leaves persisted values alone until Pi
confirms. null remains an explicit clear and is still forwarded. Once
get_state/set_model/set_thinking_level confirm a value it is set and
reported normally.
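
The unknown/clear distinction can be sketched as below. `getKeepAliveRuntime` is named in this message, but its signature and the `RuntimeState` and `KeepAliveRuntime` shapes here are assumptions.

```typescript
// undefined = not yet confirmed by Pi; null = explicitly cleared.
interface RuntimeState {
    currentModel: string | null | undefined;
    currentThinkingLevel: string | null | undefined;
}

interface KeepAliveRuntime {
    model?: string | null;
    effort?: string | null;
}

// Omit unknown fields entirely so the hub leaves its persisted values alone.
function getKeepAliveRuntime(state: RuntimeState): KeepAliveRuntime {
    const runtime: KeepAliveRuntime = {};
    if (state.currentModel !== undefined) runtime.model = state.currentModel;
    if (state.currentThinkingLevel !== undefined) runtime.effort = state.currentThinkingLevel;
    return runtime;
}

// First heartbeat, before get_state: nothing is known, so nothing is sent.
const beforeGetState = getKeepAliveRuntime({ currentModel: undefined, currentThinkingLevel: undefined });
// After an explicit clear: null is forwarded and the hub clears the field.
const afterClear = getKeepAliveRuntime({ currentModel: null, currentThinkingLevel: 'high' });
```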

* fix(pi): disable Ctrl/Cmd+M model cycler for Pi entirely

Fix #11 removed piModelOptions from availableModelOptions, assuming
getNextModelForFlavor('pi', model, undefined) was a no-op. It is not:
the Pi branch returns normalizeCurrentModel(model), i.e. the current
modelId as a bare string, so the shortcut still called onModelChange with
a bare modelId. That loses the provider and can pick the wrong cached
match, clear the model when session.model is empty, or hit 'provider is
not yet known'. Short-circuit the handler for Pi so model changes go only
through the dedicated provider-qualified PiModelPanel.

* fix(pi): persist piSelectedModel from get_state and startup set_model paths

Pi stores session.model as the bare modelId and relies on
metadata.piSelectedModel ({ provider, modelId }) to disambiguate
duplicate modelId values across providers in the web picker and
thinking-level filtering. But piSelectedModel was only written by the web
/sessions/:id/model path (hub persistPiSelectedModel). The runtime paths
that set currentModel/currentProvider — get_state, the startup
get_available_models set_model, and the set_model response — only
keepAlive'd the bare modelId, so a Pi session on Pi's default model,
resumed from CLI, or started with --model had no provider identity in
metadata and could render/filter against the wrong provider.

Add persistSelectedPiModel(session) (no-op unless both fields are known)
and call it after get_state, after a successful startup set_model, and
after the set_model response updates the fields. This mirrors what the
web picker already does.

* fix(pi): default startingMode to remote — Pi has no local TUI path

A terminal `hapi pi` launch defaulted to startingMode 'local' and marked
the session controlledByUser, but Pi only runs as `pi --mode rpc` with
piped stdio — there is no local terminal/TUI input path like Claude/Codex
have. The terminal user could not drive the session and the web treated
it as local-controlled, so the first terminal Pi session was stuck until
manually switched from the web.

Default to 'remote' so the session is immediately drivable from the web.
An explicit opts.startingMode (runner path) still takes precedence.

* fix(pi): resume with remote startingMode — no local TUI path

The previous Fix #19 changed the `hapi pi` default to remote, but
`hapi resume` still passed startingMode: 'local' into runPi for Pi
sessions, re-introducing the same unsupported local-control state on the
resume path: setControlledByUser publishes controlledByUser while Pi has
no terminal/TUI input, hiding/rejecting remote-only controls until a web
switch. Pass 'remote' here too and update the resume test accordingly.

* fix: restore e2e/scratchlist.spec.ts deleted from main by mistake

The earlier "remove unrelated E2E harness" commit (d1e5b4c) deleted the
whole e2e/ directory this branch had added, but scratchlist.spec.ts is a
main-branch Playwright spec (the only file under playwright testDir
./e2e). Its removal left `bun run test:e2e` with no tests to run while
the script and playwright.config.ts still point at that directory.

Restore scratchlist.spec.ts from main; the unrelated harness files
(HARNESS.md, harness.*, integration/*.mts) that were genuinely
branch-only additions stay removed.

---------

Co-authored-by: pi <pi@local>
Co-authored-by: Claude <noreply@anthropic.com>
heavygee added a commit that referenced this pull request Jun 18, 2026
…dismissed (HAPI Bot, PR tiann#896)

The previous state machine swallowed the migration banner if the
operator reloaded the page before clicking dismiss: the migration flag
was set on success, and on remount the init logic mapped a
flag-set/dismiss-not-set session to 'pre-migrated', a state the banner
explicitly refuses to render. Net effect: a migrated session never
prompted for affirmative dismissal.

Fixes:

- Drop the 'pre-migrated' state. The dismissal flag is now the only
  signal that suppresses the banner; the migration flag alone means
  'banner shows until dismissed' (now or after a reload).
- Sessions that had nothing to migrate (no v1 entries in localStorage)
  pre-emptively write BOTH flags - migrated AND dismissed - so the bot's
  banner-stickiness fix doesn't surface a banner that has nothing to
  announce on freshly-created v2 sessions.
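
The resulting init logic, as a sketch. 'completed' and 'dismissed' are the states named in this message; the 'pending' state, the `MigrationFlags` shape, and `initialBannerStatus` are assumptions made for illustration.

```typescript
type BannerStatus = 'pending' | 'completed' | 'dismissed';
type BannerFlag = 'migrated' | 'dismissed';

interface MigrationFlags {
    migrated: boolean;
    dismissed: boolean;
    hasV1Entries: boolean;
}

function initialBannerStatus(flags: MigrationFlags): { status: BannerStatus; writeFlags: BannerFlag[] } {
    // The dismissal flag is the only signal that suppresses the banner.
    if (flags.dismissed) return { status: 'dismissed', writeFlags: [] };
    // Migrated but not dismissed: the banner shows, now or after a reload.
    if (flags.migrated) return { status: 'completed', writeFlags: [] };
    // Nothing to migrate: opt out pre-emptively by writing both flags.
    if (!flags.hasV1Entries) return { status: 'dismissed', writeFlags: ['migrated', 'dismissed'] };
    // v1 entries exist and migration has not run yet (hypothetical state name).
    return { status: 'pending', writeFlags: [] };
}

const afterReload = initialBannerStatus({ migrated: true, dismissed: false, hasV1Entries: true });
const freshSession = initialBannerStatus({ migrated: false, dismissed: false, hasV1Entries: false });
```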

Tests:

- New `reload-before-dismiss leaves the banner visible` test pins the
  fix end-to-end: mount #1 migrates -> 'completed', unmount, mount #2
  on the same session reads the localStorage flags and stays
  'completed'.
- New `opts fresh sessions out of the banner pre-emptively` test pins
  the no-v1-entries shortcut.
- Existing `does not re-migrate on a mount where the migrated flag is
  already set` updated to assert 'completed' (not the dropped
  'pre-migrated').
- Existing `skips migration when localStorage is empty` updated to
  assert the new 'dismissed' status + the banner-dismissed flag.
- Banner test for the 'pre-migrated -> nothing' case removed (the state
  no longer exists).

Co-authored-by: Cursor <cursoragent@cursor.com>
Development

Successfully merging this pull request may close these issues.

bug(voice): context formatters hardcode "Claude Code" for all agent flavors
